
    Probabilistic models of information retrieval based on measuring the divergence from randomness

    We introduce and create a framework for deriving probabilistic models of Information Retrieval. The models are nonparametric models of IR obtained in the language model approach. We derive term-weighting models by measuring the divergence of the actual term distribution from that obtained under a random process. Among the random processes we study the binomial distribution and Bose-Einstein statistics. We define two types of term frequency normalization for tuning term weights in the document-query matching process. The first normalization assumes that documents have the same length and measures the information gain with the observed term once it has been accepted as a good descriptor of the observed document. The second normalization is related to the document length and to other statistics. These two normalization methods are applied to the basic models in succession to obtain weighting formulae. Results show that our framework produces different nonparametric models forming baseline alternatives to the standard tf-idf model.
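
The weighting scheme described in this abstract can be illustrated with a minimal sketch. It assumes one commonly cited instance of the framework (a geometric approximation to Bose-Einstein statistics as the randomness model, a Laplace-style first normalization, and a logarithmic document-length normalization). The abstract does not state these formulas, so the exact forms, the function names, and the parameter `c` are assumptions to be checked against the paper.

```python
import math


def normalize_tf(tf, doc_len, avg_doc_len, c=1.0):
    """Length normalization (assumed form): rescale tf as if the
    document had the average length. `c` is a hypothetical tuning knob."""
    return tf * math.log2(1.0 + c * avg_doc_len / doc_len)


def bose_einstein_info(tfn, coll_freq, num_docs):
    """Informative content, -log2 P(tfn), under a geometric approximation
    to Bose-Einstein statistics (assumed form). `coll_freq` is the term's
    total frequency in the collection."""
    lam = coll_freq / num_docs  # mean term frequency per document
    return -math.log2(1.0 / (1.0 + lam)) - tfn * math.log2(lam / (1.0 + lam))


def laplace_gain(tfn):
    """Information gain once the term is accepted as a good descriptor
    of the document (assumed Laplace law-of-succession form)."""
    return 1.0 / (tfn + 1.0)


def dfr_weight(tf, doc_len, avg_doc_len, coll_freq, num_docs, c=1.0):
    """Term weight: divergence from randomness scaled by the gain,
    both computed on the length-normalized term frequency."""
    if tf <= 0:
        return 0.0
    tfn = normalize_tf(tf, doc_len, avg_doc_len, c)
    return laplace_gain(tfn) * bose_einstein_info(tfn, coll_freq, num_docs)


# Same within-document frequency, different collection-wide frequencies.
rare = dfr_weight(tf=3, doc_len=100, avg_doc_len=100, coll_freq=10, num_docs=1000)
common = dfr_weight(tf=3, doc_len=100, avg_doc_len=100, coll_freq=1000, num_docs=1000)
```

In this sketch a term that is rare across the collection receives a larger weight than a common one at the same within-document frequency, which is the behaviour expected when weights measure how far the observed distribution departs from a random one.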

    Contexts as Relativized Definitions: A Formalization Via Fixed Points

    We present a novel account of contexts, formalized as fixed point equations in the modal logic QKD4Z. This offers the ability to represent consistency and provability at the object level, with which one can then represent various relationships that must hold between different contexts, such as inheritance, disjointness, compatibility, etc. The logic also offers the ability to name contexts and to obtain explicit sentential representations for these by solving suitable fixed point equations. We illustrate our approach by examples concerning default inheritance of contexts, contradictory contexts, and integration of multiple contexts.

    1 Introduction

    A context is the snapshot of a current state of affairs, to which an intelligent agent relativizes his reasoning. Agent reasoning is local and indexical, in the sense that truth depends on the actual embedded situation in which the agent is or on his specific state of mind. A typical example is when the same term denotes different objects...